Ling et al. BMC Genomics 2011, 12:471 
http://www.biomedcentral.com/1471-2164/12/471 






RESEARCH ARTICLE Open Access 



Genome-wide analysis of WRKY gene family in 
Cucumis sativus 

Jian Ling, Weijie Jiang*, Ying Zhang, Hongjun Yu, Zhenchuan Mao, Xingfang Gu, Sanwen Huang and Bingyan Xie 
Abstract 

Background: WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in 
many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. 
Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication 
of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY 
proteins, and to compare these positively identified proteins with their homologs in model plants, such as 
Arabidopsis. 

Results: We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their 
encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (groups 1-3). Analysis of 
expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their 
transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were 
differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles 
of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, 
except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under 
positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or 
positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of 
group 3 orthologs. 

Conclusions: Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their 
expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 
WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional 
divergence of these genes. 



Background 

Transcription factors exhibit sequence-specific DNA- 
binding and are capable of activating or repressing tran- 
scription of downstream target genes. In plants, WRKY 
proteins constitute a large family of transcription factors 
that are involved in various physiological processes. Pro- 
teins in this family contain at least one highly conserved 
signature domain of about 60 amino acid residues, 
which includes the conserved WRKYGQK sequence fol- 
lowed by a zinc finger motif, located in the C-terminal 
region [1]. The WRKY domain facilitates binding of the 
proteins to the W box or the SURE (sugar-responsive 
cis-element) in the promoter regions of target genes 



* Correspondence: jiangwj@mail.caas.net.cn; xieby@mail.caas.net.cn 
Institute of Vegetables and Flowers, Chinese Academy of Agricultural 
Sciences, 12 Zhongguancun South Street, Beijing, 100081 China 




[2,3]. As deduced from nuclear magnetic resonance 
(NMR) analysis of the C-terminal WRKY domain of 
Arabidopsis WRKY4 (AtWRKY4), the conserved 
WRKYGQK sequence of WRKY domains is directly 
involved in DNA binding [4]. WRKY proteins can be 
classified into three groups (1, 2 and 3) based on the 
number of WRKY domains and the pattern of the zinc- 
finger motif. Group 1 proteins typically contain two 
WRKY domains including a C2H2 motif. Group 2 pro- 
teins have a single WRKY domain and a C2H2 zinc-fin- 
ger motif and can be further divided into five subgroups 
(2a-2e) based on the phylogeny of the WRKY domains. 
Group 3 proteins also have a single WRKY domain, but 
their zinc-finger-like motif is C2-H-C [1]. 
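The grouping rule described above (number of WRKY domains, then zinc-finger pattern) can be illustrated with a minimal sketch. It is a simplified heuristic, not the procedure used in this study: the zinc-finger spacings are assumptions taken from the general WRKY literature, the function name classify_wrky is hypothetical, and the toy sequences are synthetic. Real proteins can deviate from the rule, for example group 1 members that have lost one domain.

```python
import re

# Conserved heptapeptide named in the text.
WRKY_CORE = re.compile(r"WRKYGQK")

# ASSUMPTION: spacings from the general WRKY literature, not from this article.
# C2H2:   C-X4-5-C-X22-23-H-X-H
# C2-H-C: C-X7-C-X23-H-X-C
C2H2 = re.compile(r"C.{4,5}C.{22,23}H.H")
C2HC = re.compile(r"C.{7}C.{23}H.C")


def classify_wrky(protein: str) -> str:
    """Rough group call from the domain count and zinc-finger pattern."""
    cores = [m.start() for m in WRKY_CORE.finditer(protein)]
    if not cores:
        return "no WRKY domain"
    if len(cores) >= 2:
        return "group 1"
    # Single domain: inspect the zinc finger that follows the heptapeptide.
    downstream = protein[cores[0]:]
    if C2HC.search(downstream):
        return "group 3"
    if C2H2.search(downstream):
        return "group 2"
    return "unclassified"


def _toy(finger: str) -> str:
    """Build a synthetic one-domain sequence (alanine filler, illustrative only)."""
    return "MA" + "WRKYGQK" + "A" * 10 + finger + "AA"


toy_group2 = _toy("C" + "A" * 4 + "C" + "A" * 22 + "HAH")
toy_group3 = _toy("C" + "A" * 7 + "C" + "A" * 23 + "HAC")
toy_group1 = toy_group2 + toy_group2  # two WRKY domains

for name, seq in [("toy_group1", toy_group1),
                  ("toy_group2", toy_group2),
                  ("toy_group3", toy_group3)]:
    print(name, "->", classify_wrky(seq))
```

Subgroups 2a-2e are not resolved by this heuristic, because they are defined by the phylogeny of the WRKY domains rather than by a sequence pattern.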

Since the cloning of the first cDNA encoding a WRKY 
protein, SPF1 from sweet potato [5], a large number of 



© 2011 Ling et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons 
Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in 
any medium, provided the original work is properly cited. 






WRKY proteins have been experimentally identified 
from several plant species [6-17], and have been shown 
to be involved in various physiological processes under 
normal growth conditions and under various stress conditions 
[18]. It has been well documented that WRKY 
proteins play a key role in plant defense against various 
biotic stresses including bacterial, fungal and viral 
pathogens [19-27]. They also play important regulatory 
roles in developmental processes, such as trichome 
initiation [28], embryo morphogenesis [29], senescence 
[30], and some signal transduction processes mediated 
by plant hormones such as gibberellic acid [31], abscisic 
acid [32,33] or salicylic acid [34]. There is also accumu- 
lating evidence that WRKY proteins are involved in 
responses to various abiotic stresses. In Arabidopsis, 
microarray analyses have revealed that some of the 
WRKY transcripts are strongly regulated in response to 
various abiotic stresses, such as salinity, drought and 
cold [35-37]. In rice, under abiotic stresses (cold, 
drought and salinity) or various phytohormone treat- 
ments, 54 WRKY genes showed significant differences 
in their transcript abundance [18]. In barley, a WRKY 
gene, Hv-WRKY38, is expressed in response to cold and 
drought stress [38], while in soybean at least 
nine WRKY genes are differentially 
expressed under abiotic stress [15]. 

Because of their extensive involvement in various phy- 
siological processes, it is likely that the WRKY family in 
angiosperms has expanded greatly during evolution. 
There are at least 72 WRKY family members in Arabi- 
dopsis [1] and at least 109 in rice [17]. Gene duplication 
events have played a critical role in the expansion of 
WRKY genes. For example, in rice, 80% of WRKY gene 
loci are located in duplicated regions [18]. Gene duplica- 
tion events can lead to the generation of new WRKY 
genes. It is worth noting that the three groups of 
WRKY genes appeared at different times during evolu- 
tion. Most members of groups 1 and 2 appear to have 
arisen before the divergence of the monocots and dicots, 
while group 3 WRKY genes seem to have had a relatively 
later origin [17]. In addition, a recent study showed that 
expression divergence had occurred among duplicated 
WRKY genes [18]. However, the reasons for expression 
divergence among duplicated WRKY genes remain 
unclear. 

Cucumber is not only an economically important cul- 
tivated plant, but also a model system for studies on sex 
determination and plant vascular biology [39]. A draft of 
the Cucumis sativus var. sativus L. genome sequence 
was reported recently [40]. In this study, we searched 
this genome sequence to identify the WRKY genes of 
cucumber (CsWRKY). Then, we analyzed the expression 
of the identified CsWRKY genes under normal growth 
conditions and under various abiotic stress conditions. 



We compared the structure of the encoded proteins and 
the expression profiles of CsWRKY genes with those of 
their putative homologs in Arabidopsis thaliana WRKY 
(AtWRKY) genes, and found that there were notable 
differences between group 3 WRKY genes of Arabidopsis 
and cucumber. The evolutionary analysis of group 3 
WRKY genes indicated that, unlike those of cucumber, the 
recently duplicated WRKY genes of Arabidopsis have 
been under positive selection pressure. This may explain 
the expression divergence of their orthologs. These stu- 
dies will be useful for understanding the role of WRKY 
genes in plant responses to abiotic stresses. In addition, 
these results provide information about the relationship 
between evolution and functional divergence of the 
WRKY family. 

Results 

Identification of WRKY family in cucumber 

A total of 57 genes in the cucumber genome were identified 
as possible members of the WRKY superfamily, 
encoding 57 WRKY proteins. Annotation revealed that 
eight of these proteins each have two complete WRKY 
domains. A total of 52 WRKY genes could be mapped on the 
chromosomes and were renamed CsWRKY1 to CsWRKY52 based 
on their order on the chromosomes, from chromosome 
1 to 7 (Figure 1). Five WRKY genes (Csa018657, 
Csa018622, Csa018069, Csa018094 and Csa022995) that 
could not be conclusively mapped to any chromosome 
were renamed CsWRKY53-CsWRKY57, respectively. In 
addition, the nucleotide sequence of Csa026380 was 
completely identical to that of Csa014665; therefore, the 
latter was eliminated from this study. 
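The position-based renaming described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Gene record, the name_by_position helper, the placeholder identifiers and the coordinates are all hypothetical and are not taken from the cucumber annotation.

```python
from collections import namedtuple

# Hypothetical record: annotation ID, chromosome number, start coordinate.
Gene = namedtuple("Gene", "annotation_id chromosome start")


def name_by_position(genes, prefix="CsWRKY"):
    """Number genes by chromosome, then by position along each chromosome."""
    ordered = sorted(genes, key=lambda g: (g.chromosome, g.start))
    return {g.annotation_id: f"{prefix}{i}"
            for i, g in enumerate(ordered, start=1)}


# Placeholder identifiers and coordinates, for illustration only.
genes = [
    Gene("geneC", 2, 500),
    Gene("geneA", 1, 900),
    Gene("geneB", 1, 100),
]
names = name_by_position(genes)
print(names)
```

Genes that cannot be placed on a chromosome would need a separate ordering criterion; the Figure 1 legend states that the five unplaced genes were ordered by their Hmmsearch raw scores.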

Next, to establish whether these WRKY genes are 
expressed, we screened the cucumber EST database in 
NCBI. Twenty-seven putative WRKY genes matched at 
least one EST (Table 1). We cloned and sequenced 
full-length cDNAs of 32 of the annotated CsWRKY 
genes (Table 1). Consequently, annotation errors in 17 
putative WRKY genes could be corrected (data not 
shown). The CDSs of all 32 CsWRKY genes have been 
submitted to GenBank, and their accession numbers are 
shown in Table 1. 

Multiple sequence alignment, structure and phylogenetic 
analysis 

The phylogenetic relationship of the CsWRKY proteins 
was examined by multiple sequence alignment of their 
WRKY domains, which span approximately 60 amino acids 
(Figure 2). A comparison with the WRKY domains of several 
different AtWRKY proteins resulted in a better separation 
of the different groups and subgroups. For each of the 
groups or subgroups, 1, 2a to 2e and 3, one representa- 
tive was chosen randomly. These were: AtWRKY20, 40, 






Figure 1 Mapping of the WRKY gene family on Cucumis sativus L. chromosomes. The size of a chromosome is indicated by its relative 
length. To simplify the presentation, we renamed the putative WRKY genes from CsWRKY1 to CsWRKY52 based on their order on the 
chromosomes. Five putative WRKY genes could not be localized on a specific chromosome, so we renamed them from CsWRKY53 to CsWRKY57 
according to their raw scores in a search of cucumber WRKY proteins with the Hmmsearch program. 






Table 1 WRKY genes in cucumber 



Columns: Gene; Annotation ID; GenBank accession; Predicted ORF length; Predicted gene length*; EST hits; Expressed**; Obtained CDS sequence*** 

Gene         Annotation ID 
CsWRKY1      Csa005379 
CsWRKY2      Csa004516 
CsWRKY3      Csa003764 
CsWRKY4      Csa016371 
CsWRKY5      Csa015868 
CsWRKY6      Csa017345 
CsWRKY7      Csa001650 
CsWRKY8      Csa006570 
CsWRKY9      Csa026380 
CsWRKY10#    Csa014665 
CsWRKY11     Csa005866 
CsWRKY12     Csa005867 
CsWRKY13     Csa005948 
CsWRKY14     Csa001212 
CsWRKY15     Csa018420 
CsWRKY16##   Csa018419 
CsWRKY17     Csa020112 
CsWRKY18     Csa000336 
CsWRKY19     Csa008740 
CsWRKY20     Csa019944 
CsWRKY21     Csa004863 
CsWRKY22     Csa004896 
CsWRKY23     Csa004828 
CsWRKY24     Csa004742 
CsWRKY25     Csa002274 
CsWRKY26     Csa002896 
CsWRKY27     Csa002813 
CsWRKY28     Csa016219 
CsWRKY29     Csa016218 
CsWRKY30     Csa010443 
CsWRKY31     Csa020355 
CsWRKY32     Csa014848 
CsWRKY33     Csa009473 
CsWRKY34     Csa016087 
CsWRKY35     Csa016061 
CsWRKY36     Csa015442 
CsWRKY37     Csa009672 
CsWRKY38     Csa019857 
CsWRKY39     Csa019858 
CsWRKY40     Csa019119 
CsWRKY41     Csa013101 
CsWRKY42     Csa013154 
CsWRKY43     Csa010294 
CsWRKY44     Csa010089 
CsWRKY45     Csa010221 
CsWRKY46     Csa000701 
CsWRKY47     Csa003388 
CsWRKY48     Csa013553 
CsWRKY49     Csa013650 
CsWRKY50     Csa007193 
CsWRKY51     Csa016725 



GU984009 
GU984010 
GU984011 



GU984012 



GU984014 

GU984015 
GU984016 

GU984017 
GU984018 
GU984019 

GU984020 
GU984021 
GU984022 
GU984023 
GU984024 
GU984025 



GU984026 
GU984027 
GU984028 



GU984029 
GU984030 



GU984031 



GU984032 
GU984033 

GU984034 
GU984035 
GU984036 



1773 
1731 
1839 
1521 



2184 
1047 

768 

540 

399 

882 

681 

1506 

1581 

1005 

1239 

849 

948 

843 

1431 

1473 

939 

645 

873 

315 

810 

840 

1068 

975 

1152 

822 

954 

918 

1521 

732 

453 

522 

510 

618 

546 

432 

885 

786 

897 

1449 

1302 

876 

1056 



3659 
2527 
3302 
3200 
1150 
1027 
2800 
10512 
1704 

1648 
953 
630 
1364 
758 
2683 
6663 
1202 
2839 
1123 
1321 
962 
2653 
2219 
1614 
1198 
1123 
1475 
1328 
2017 
1737 
2909 
1559 
2410 
5996 
1432 
4068 
3117 
592 
522 
3539 
2623 
2318 
2005 
1063 
1754 
2148 



1983 
1554 
1726 






Table 1 WRKY genes in cucumber (Continued) 



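Gene      Annotation ID  GenBank accession  Predicted ORF length  Predicted gene length*  EST hits  Expressed**  Obtained CDS sequence*** 
CsWRKY52  Csa001863      GU984037           729                   2911                              +            + 
CsWRKY53  Csa018657      GU984038           741                   2095                    1         +            + 
CsWRKY54  Csa018622      GU984039           240                   1886                              +            + 
CsWRKY55  Csa018069      GU984040           807                   2807                    1         +            + 
CsWRKY56  Csa018094      GU984041           498                   2565                              +            + 
CsWRKY57  Csa022995                         972                   1454                              + 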



Note: 

* Include intron length; 

** Expression of WRKY genes was detected in a variety of cucumber tissues by RT-PCR. +: expressed WRKY genes, -: no signal was detected; 
*** The CDS of WRKY genes obtained by RT-PCR; +: obtained. 

# Annotated CsWRKY9 and CsWRKY10 were actually one gene. 

## CsWRKY15 and CsWRKY16 were two domains of one WRKY gene. 



72, 50, 74, 65 and 54. As shown in Figure 2, the 
sequences in the WRKY domain were highly conserved. 

Sequence comparisons, phylogenetic and structural 
analyses showed that the WRKY domains could be clas- 
sified into three large groups corresponding to groups 1, 
2 and 3 in Arabidopsis as shown by Eulgem et al, 2000 
(Figure 3). It is worth noting that group 1 contained 12 
CsWRKY proteins, eight of which contained two WRKY 
domains. However, the other four (CsWRKY15, 
CsWRKY16, CsWRKY38 and CsWRKY39) contained 
only one WRKY domain but clustered with CTWD (C- 
terminal WRKY domains) and NTWD (N-terminal 
WRKY domains) respectively. Our study further showed 
that CsWRKY15 and CsWRKY16 were actually two 
domains of one WRKY protein, while CsWRKY38 and 
CsWRKY39 were two independent WRKY proteins. 
Domain acquisition and domain loss events appear to 
have shaped the WRKY family [41,42]. Thus, 
CsWRKY38 and CsWRKY39 may have arisen from a 
two-domain WRKY protein that lost one of its WRKY 
domains during evolution. The structure and phyloge- 
netic tree of the CsWRKY domain clearly indicated that 
group 2 proteins can be divided into five distinct sub- 
groups (2a-e). Compared with the group 3 proteins in 
Arabidopsis (14 members), there are only 6 CsWRKY 
proteins in group 3. Whereas genome duplication events 
have resulted in the expansion of the WRKY genes in 
Arabidopsis and rice [17], it appears that these events 
have not occurred in the cucumber WRKY family. 
Huang et al. [40] reported that the cucumber 
genome shows no evidence of recent whole-genome 
duplication or tandem duplication. We nevertheless used the 
method of Schauser et al. [43] to search for small 
duplication blocks in the CsWRKY family, but none were found. 
In addition, a rooted phylogenetic tree of WRKY 
domains was also constructed to identify putative ortho- 
logs in Arabidopsis and cucumber (additional file 1). All 
orthologs are listed in additional file 2. 

Analysis of the structure of CsWRKY genes showed 
that all WRKY genes except CsWRKY40 had at least 
one intron. Two major types of intron splicing 
were found in the conserved WRKY domains of 
CsWRKY genes (Figure 2), similar to those in the WRKY 
domains of AtWRKY genes. However, the 
conserved introns were 2.8 times longer in cucumber 
(~686 bp) than in Arabidopsis (~241 bp). Coincidentally, 
this ratio was very similar to the size difference (2.9 
times) between the genomes of cucumber (376 Mb) and 
Arabidopsis (125 Mb). The conserved motifs of WRKY 
family proteins in cucumber and Arabidopsis were 
investigated using MEME version 4.4 as described in the 
Methods (additional file 3), and a schematic overview of 
the identified motifs is given in additional file 4. As 
displayed schematically in Figure 4, except for the 
members of group 2c and group 2e, one or more 
conserved motifs outside of the WRKY domain motif 
can be detected in a WRKY protein. The CsWRKY and 
AtWRKY proteins from groups 1 and 2 always 
share the same conserved motifs. In contrast, some 
members of group 3 AtWRKY (AtWRKY63, AtWRKY64, 
AtWRKY66 and AtWRKY67) show Arabidopsis-specific 
conserved motifs (motifs 6, 7 and 8; additional file 3), 
but other members of group 3 share the same conserved 
motifs with other CsWRKY proteins. 
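The intron-length and genome-size ratios quoted in this paragraph can be checked with a short calculation. The sketch below uses only the figures as printed in the text; the variable names are illustrative.

```python
# Values as printed in the text.
intron_cucumber_bp = 686       # mean conserved intron length, cucumber
intron_arabidopsis_bp = 241    # mean conserved intron length, Arabidopsis
genome_cucumber_mb = 376       # genome size as printed
genome_arabidopsis_mb = 125    # genome size as printed

intron_ratio = intron_cucumber_bp / intron_arabidopsis_bp
genome_ratio = genome_cucumber_mb / genome_arabidopsis_mb

print(f"intron length ratio: {intron_ratio:.2f}")
print(f"genome size ratio:   {genome_ratio:.2f}")
```

With these inputs the intron ratio rounds to 2.8, as stated, while the genome ratio evaluates to about 3.0, close to but not exactly the quoted 2.9. The quoted value would correspond to a slightly smaller cucumber genome estimate.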

Expression profile of CsWRKY genes under normal growth 
conditions and under various abiotic stress conditions 

We analyzed the expression of all CsWRKY genes under 
normal growth conditions in seven different tissues: 
cotyledons, leaves, roots, stems, female flowers, male 
flowers and fruits. Not all of the predicted genes were 
expressed in plants grown under normal growth condi- 
tions. Among 55 predicted genes, 48 genes (87%) were 
expressed in at least one of the seven tissues (Figure 5). 
The other seven genes did not show any detectable 
expression as tested by RT-PCR in the above tissues, 
but they may be expressed in other tissues, e.g., seeds. 
Also, some of the CsWRKY genes may be pseudogenes. 
The following ten genes were expressed in all tested tis- 
sues with relatively higher expression intensities: 






Figure 2 Alignment of multiple CsWRKY and selected AtWRKY domain amino acid sequences. Alignment was performed using Clustal W. 
The suffix 'N' or 'C' indicates the N-terminal WRKY domain or the C-terminal WRKY domain, respectively, of a specific WRKY protein. The amino 
acids forming the zinc-finger motif are highlighted in yellow. The conserved WRKY amino acid signature is highlighted in grey, and gaps are 
marked with dashes. The position of a conserved intron is indicated by an arrowhead. 

Domains shown in the alignment, by group: 
Group 1: AtWRKY20C, CsWRKY17C, CsWRKY2C, CsWRKY15, CsWRKY8C, CsWRKY37C, CsWRKY39, CsWRKY23C, CsWRKY49C, CsWRKY4C, CsWRKY24C; 
AtWRKY20N, CsWRKY17N, CsWRKY37N, CsWRKY8N, CsWRKY16, CsWRKY24N, CsWRKY2N, CsWRKY23N, CsWRKY38, CsWRKY4N, CsWRKY49N 
Group 2a: AtWRKY40, CsWRKY21, CsWRKY32, CsWRKY11, CsWRKY12 
Group 2b: AtWRKY72, CsWRKY1, CsWRKY3, CsWRKY19, CsWRKY48 
Group 2c: AtWRKY50, CsWRKY41, CsWRKY44, CsWRKY54, CsWRKY40, CsWRKY57, CsWRKY55, CsWRKY52, CsWRKY46, CsWRKY42, CsWRKY26, 
CsWRKY56, CsWRKY13, CsWRKY28, CsWRKY43, CsWRKY53, CsWRKY30 
Group 2d: AtWRKY74, CsWRKY51, CsWRKY10, CsWRKY9, CsWRKY14, CsWRKY33, CsWRKY45, CsWRKY25, CsWRKY5 
Group 2e: AtWRKY65, CsWRKY7, CsWRKY6, CsWRKY47, CsWRKY18, CsWRKY29, CsWRKY36, CsWRKY27 
Group 3: AtWRKY54, CsWRKY34, CsWRKY22, CsWRKY20, CsWRKY50, CsWRKY31, CsWRKY35 








Figure 3 Unrooted phylogenetic tree representing relationships among WRKY domains of cucumber and Arabidopsis. The amino acid 
sequences of the WRKY domain of all CsWRKY and AtWRKY proteins were aligned with Clustal W and the phylogenetic tree was constructed 
using the neighbor-joining method in MEGA 4.0. For group 1 proteins, the suffix 'N' or 'C' indicates the N-terminal WRKY domain or the 
C-terminal WRKY domain. The red arcs indicate different groups (or subgroups) of WRKY domains. Diamonds represent orthologs from cucumber 
(blue) and Arabidopsis (red). 



CsWRKY2, CsWRKY7, CsWRKY14, CsWRKY17, 
CsWRKY25, CsWRKY37, CsWRKY41, CsWRKY44, 
CsWRKY49 and CsWRKY57. Five WRKY genes 
(CsWRKY5, CsWRKY13, CsWRKY23, CsWRKY28 and 
CsWRKY55) were expressed at relatively low levels in all 
the tested tissues. 
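The gene counts used in this section can be reconciled with a short calculation. The sketch below only restates numbers given in the text and in the Table 1 footnotes (57 annotated candidates, two pairs of annotations that each turned out to be a single gene, 48 expressed genes); the variable names are illustrative.

```python
annotated_candidates = 57   # genes initially identified in the genome
merged_pairs = 2            # CsWRKY9/CsWRKY10 and CsWRKY15/CsWRKY16 are one gene each
predicted_genes = annotated_candidates - merged_pairs

expressed_genes = 48        # detected by RT-PCR in at least one tissue
not_detected = predicted_genes - expressed_genes
expressed_percent = 100 * expressed_genes / predicted_genes

print(predicted_genes, not_detected, round(expressed_percent))
```

This reproduces the 55 predicted genes, the seven genes without detectable expression, and the 87% figure quoted above.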

We used RT-PCR analyses to examine the expression 
of CsWRKY genes in response to three different abiotic 
stresses: cold, drought and salinity. Of the 48 expressed 



CsWRKY genes, 23 showed differential expression in 
response to at least one stress, whereas the other 25 did 
not (Table 2). It should be noted that none of the 
stress-inducible CsWRKY genes belongs to group 3. We 
conducted real-time PCR analyses to confirm and quan- 
tify the expression levels of the 23 stress-inducible 
WRKY genes in response to abiotic stresses. As shown 
in Figure 6, RT-PCR and real-time PCR generally gave 
the same results for the expression profiles and 






Figure 4 Schematic diagram of amino acid motifs of CsWRKY 
and AtWRKY proteins from different groups (or subgroups). 
Motif analysis was performed using Meme 4.0 software as described 
in the Methods. The selected WRKY proteins are listed on the left. 
The black solid line represents the corresponding WRKY protein and 
its length. The different-colored boxes represent different motifs and 
their position in each WRKY sequence. A detailed motif introduction 
for all CsWRKY proteins is shown in additional file 4. 

Proteins shown, by group: 
Group 1: AtWRKY25, CsWRKY23 
Group 2a: AtWRKY40, CsWRKY21 
Group 2b: AtWRKY72, CsWRKY1 
Group 2c: AtWRKY28, CsWRKY43 
Group 2d: AtWRKY39, CsWRKY25 
Group 2e: AtWRKY22, CsWRKY18 
Group 3: AtWRKY41, AtWRKY63, CsWRKY50 
Legend: WRKY motif, motif 1, motif 2, motif 3, motif 4, motif 5, motif 6, motif 7, motif 8 



abundance of transcripts. However, in rare instances, the 
difference in expression detected by real-time PCR was 
more significant than that detected by RT-PCR (Figure 
5E). As shown in Table 2, the results of real-time PCR 
showed that most of the stress-responsive genes were 
upregulated in response to abiotic stress (Figure 6A, B, 
C), and only three genes were downregulated (Figure 
6D). As determined by real-time PCR analysis, there 
were no differences in the expression of the six group 3 
CsWRKY genes in response to abiotic stress (Figure 6F). 

Comparison of abiotic stress-inducible orthologs between 
cucumber and Arabidopsis 

We compared the expression of CsWRKY genes with 
that of their possible orthologs in Arabidopsis under 
abiotic treatment. As shown in additional file 5, except 
for group 3 WRKY genes, Arabidopsis WRKY genes 
whose orthologous CsWRKY genes were not induced by 
abiotic treatments were also not stress-inducible. In 
addition, most of the orthologous AtWRKY genes of 
stress-inducible CsWRKY genes also responded to at least one 
stress treatment. These findings imply a possible 
correlation between the expression profiles of these 
orthologs in Arabidopsis and cucumber in response to 
abiotic stresses. Among the CsWRKY genes whose 
expression changed in response to abiotic stress, there 
were 13 for which stress-inducible orthologs existed in 
Arabidopsis (additional file 5). To investigate whether 
the expressions of these orthologs were correlated 
between the two species, we compared the expressions 



Figure 5 Expression profiles of cucumber WRKY genes in various tissues as determined by RT-PCR analyses. Seven amplified bands from 
left to right for each WRKY gene represent amplified products from cotyledons, leaves, roots, stems, female flowers, male flowers and fruits. 






Table 2 CsWRKY gene expression patterns under abiotic stress as determined by RT-PCR and real-time PCR.

Gene      Cold  Salt  Dry    Gene      Cold  Salt  Dry
CsWRKY2   +     +     +      CsWRKY32  nc    nc    nc
CsWRKY4   +     nc    nc     CsWRKY33  +     nc    nc
CsWRKY5   nc    nc    nc     CsWRKY34  nc    nc    nc
CsWRKY6   nc    nc    nc     CsWRKY35  nc    nc    nc
CsWRKY7   nc    nc    nc     CsWRKY36  +     nc    nc
CsWRKY8   nc    nc    nc     CsWRKY37  nc    nc    nc
CsWRKY9   nc    nc    nc     CsWRKY38  nc    nc    nc
CsWRKY12  nc    nc    nc     CsWRKY39  nc    +     +
CsWRKY13  nc    nc    nc     CsWRKY40  ++    ++    ++
CsWRKY14  nc    +     +      CsWRKY41  nc    +     nc
CsWRKY15  nc    nc    nc     CsWRKY42  nc    +     nc
CsWRKY17  nc    nc    nc     CsWRKY43  nc    +     +
CsWRKY18  ++    +     ++     CsWRKY44  nc    +     +
CsWRKY19  nc    nc    nc     CsWRKY46  +     ++    +
CsWRKY20  nc    nc    nc     CsWRKY47  nc    nc    nc
CsWRKY21  ++    ++    ++     CsWRKY49  nc    nc    nc
CsWRKY22  nc    nc    nc     CsWRKY50  nc    nc    nc
CsWRKY23  +     -     nc     CsWRKY51  nc    nc    nc
CsWRKY24  nc    nc    nc     CsWRKY52  nc    +     +
CsWRKY25  ++    nc    nc     CsWRKY53  -     nc    +
CsWRKY26  nc    nc    nc     CsWRKY54  nc    +     +
CsWRKY27  nc    nc    nc     CsWRKY55  -     nc    ++
CsWRKY28  -     nc    nc     CsWRKY56  nc    +     +
CsWRKY31  nc    nc    nc     CsWRKY57  ++    nc    +

Cucumber seedlings were subjected to salt, drought and cold treatments for
0, 0.5, 1, 3, 6, 12 and 24 h.

Note:

nc, no significant change in gene expression; +, moderate induction of gene
expression; ++, strong induction of gene expression; -, reduction of gene
expression.

Student's t-test was used to assess the statistical significance of the difference
between treated samples and untreated samples (0 h treatment under abiotic
stress). A WRKY gene was considered induced if the P-value was < 0.01.

of these 13 pairs of orthologs under various stresses as
described in the Methods section. This analysis generated
a total of 22 sets of data (one pair of orthologs may be
induced by more than one abiotic stress). As shown in
Table 3, the correlation coefficients of 12 sets of data,
more than half of the 22 sets, were greater than 0.5,
indicating a positive correlation between the orthologous
pairs under abiotic stresses (Figure 7A-D). The expression
profiles of only two sets of data were negatively
correlated (Figure 7G-H). Finally, the average correlation
coefficient of the 22 data sets for all the putative
orthologous WRKY genes was 0.40 and differed
significantly (p < 0.01) from the average expression
correlation of a control data set composed of randomly
chosen gene pairs (0.04) (Table 3). In contrast, when the
correlation coefficients of group 3 CsWRKY and AtWRKY
orthologs were calculated, there was no clear positive or
negative correlation (Figure 7E-F). Our results indicated
that the expression profiles of stress-inducible CsWRKY
genes are correlated with those of their putative AtWRKY
orthologs, except for the group 3 WRKY genes. This finding
suggests that the expression of group 3 WRKY orthologs
differs between cucumber and Arabidopsis. All expression
data used to calculate correlations are shown in
additional file 6.
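
The correlation analysis described above can be illustrated with a minimal Python sketch. It is not the authors' script: the `pearson` helper and the two example time courses (`cs_profile`, `at_profile`) are hypothetical and serve only to show how a Pearson coefficient is obtained from two expression profiles sampled at the same seven time points. The list `TABLE3_R`, in contrast, holds the 22 coefficients as listed in Table 3, so the summary figures quoted in the text (12 data sets above 0.5, mean of 0.40) can be recomputed.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical fold-change profiles at 0, 0.5, 1, 3, 6, 12 and 24 h
# (illustrative numbers only, not data from this study).
cs_profile = [1.0, 1.8, 3.5, 6.0, 4.2, 2.5, 1.5]
at_profile = [1.0, 1.2, 2.9, 5.1, 4.8, 2.0, 1.1]
r_example = pearson(cs_profile, at_profile)

# Correlation coefficients as listed in Table 3 (22 data sets).
TABLE3_R = [0.87, 0.81, 0.77, 0.75, 0.74, 0.70, 0.67, 0.66, 0.62, 0.61,
            0.60, 0.52, 0.45, 0.40, 0.34, 0.14, 0.01, -0.08, -0.09,
            -0.11, -0.33, -0.35]

n_above_half = sum(r > 0.5 for r in TABLE3_R)
mean_r = mean(TABLE3_R)

print(f"example r = {r_example:.2f}")
print(f"data sets with r > 0.5: {n_above_half} of {len(TABLE3_R)}")
print(f"mean r = {mean_r:.2f}")
```

Only the standard library is used here so that the sketch stays self-contained; an established statistics package would be the usual choice in practice.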

Evolutionary analysis of group 3 WRKY genes in 
Arabidopsis and cucumber 

The group 3 WRKY genes seem to have greatly expanded in
angiosperms after the divergence of the monocots and
dicots (160 Mya) [44]. Here, we further investigated the
duplication and diversification of group 3 WRKY genes
after the divergence of the eurosids I group (which
includes cucumber, soybean, and poplar) and the eurosids
II group (which includes Arabidopsis) (110 Mya). A
phylogenetic tree of WRKY proteins encoded by group 3
WRKY genes of Arabidopsis (14), cucumber (6), poplar (10),
and soybean (7) was constructed using the most primitive
WRKY domain of Giardia lamblia as an outgroup. This
analysis showed that many members of the group 3 AtWRKY
proteins clustered together and displayed a close
phylogenetic relationship (Figure 8), indicating that they
arose after the divergence of the eurosids I and II. Two
types of gene duplication events, tandem duplication and
segmental duplication, were the main factors in the
expansion of group 3 AtWRKY genes. The results of this
phylogenetic analysis indicated that no gene duplication
events have occurred in CsWRKY gene evolution, because no
cucumber paralogs could be detected. Hence, the different
evolutionary patterns of group 3 WRKY genes in cucumber
and Arabidopsis arose after their divergence.

To determine whether selection pressure had affected
group 3 WRKY genes, we estimated the ω (dN/dS) values for
all branches of group 3 WRKY genes in Arabidopsis and
cucumber (Figure 9 and Table 4). In Arabidopsis, the ML
estimates of dN/dS for all nodes under model M0 were < 1,
with a mean value of 0.276 (Table 4), indicating that
purifying selection was the predominant force acting on
the evolution of the group 3 AtWRKY genes. However, the
log-likelihood differences between model M3 and model M0
were statistically significant for all nodes tested,
suggesting that selective pressure varied among branches
and that some genes might have been under positive
selection. We further used models M7 and M8 of PAML to
address whether positive selection has played a role in
the evolution of group 3 AtWRKY genes. Of the eight nodes
analyzed, log-likelihood values were significantly higher
under the M8 model than under the M7 model for five nodes
(nodes 1, 2, 3, 4 and 5), which indicates that positive
selection









[Figure 6 panels: A, cold stress, CsWRKY21; B, salt stress, CsWRKY42; C, dry stress, CsWRKY46; D, cold stress, CsWRKY53; E, cold stress, CsWRKY33; F, dry stress, CsWRKY50. x-axis in each panel: 0 h, 0.5 h, 1 h, 3 h, 6 h, 12 h, 24 h.]

Figure 6 Expression patterns of six selected WRKY genes under abiotic stresses. In A-F, the top panel shows the RT-PCR result and the
bottom panel shows the corresponding real-time PCR result. For real-time PCR, the relative amount of mRNA (y-axis) was calculated
as described in Methods. The cucumber β-actin gene was used as an internal control to normalize the data. The values 0, 0.5, 1, 3, 6,
12, and 24 (x-axis) indicate the treatment time (hours) under the corresponding abiotic stress. The error bars were calculated based on three
replicates. A-C, significantly up-regulated expression of WRKY genes was detected under abiotic stresses. D, significantly down-regulated
expression of CsWRKY53 was detected under cold treatment. E, the expression difference detected by real-time PCR was more significant than
that detected by RT-PCR. F, no significant expression difference was detected in the group 3 WRKY gene CsWRKY50 under abiotic stress. Statistical
significance was assessed using Student's t-test.



has contributed to the evolution of group 3 AtWRKY genes.
Interestingly, the terminal nodes with clusters of
duplicated AtWRKY genes were all under positive selection,
suggesting a correlation between gene duplication and
positive selection. Furthermore, we identified the
positively selected sites under model M8 using the
Bayesian method. Several positively selected sites were
detected in the above five nodes, but only one was located
within the WRKY domain region. Thus, it appears that,
because of the high degree of conservation of the WRKY
domains, positive selection acted mostly on the regions
outside of the WRKY domains. In cucumber, although the
log-likelihood differences between model M3 and model M0
suggest that selective pressure varied among branches,
there was no detectable positive selection in any of the
nodes. Given that there were no duplication events in
CsWRKY genes and that positive selection is associated
with duplication of WRKY genes as described here, the
extensive positive selection events probably followed the
group 3 WRKY gene duplication events. This positive
selection might be the main evolutionary force for group 3
AtWRKY genes. Due to the absence of duplicated genes and
positive selection in cucumber, the functions of group 3
CsWRKY genes might be more conserved than those of AtWRKY
genes.
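
The M8 versus M7 comparison is a likelihood ratio test, and a short Python sketch may help to show how the significance marks in Table 4 follow from the 2ΔlnL values. The sketch assumes that the statistic is compared with a chi-square distribution with 2 degrees of freedom, which is the usual convention for M7 versus M8 but is not stated explicitly in the text. The helper names (`chi2_sf_df2`, `significance`) are hypothetical; the statistics are the values listed in Table 4.

```python
from math import exp


def chi2_sf_df2(stat):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom.

    For 2 degrees of freedom the survival function has the closed form
    exp(-x / 2), so no statistics library is needed.
    Assumption: df = 2 for the M8 vs. M7 comparison.
    """
    if stat < 0:
        raise ValueError("the test statistic must be non-negative")
    return exp(-stat / 2.0)


def significance(p_value):
    """Return the star notation used in Table 4 (ns = not significant)."""
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"


# 2*delta(lnL) for M8 vs. M7, as listed in Table 4.
LRT_M8_VS_M7 = {
    "AtWRKY node 1": 17.76,
    "AtWRKY node 2": 6.92,
    "AtWRKY node 3": 8.37,
    "AtWRKY node 4": 9.97,
    "AtWRKY node 5": 10.66,
    "CsWRKY node 1": 1.40e-05,
    "CsWRKY node 2": 8.80e-05,
    "CsWRKY node 3": 2.99e-05,
}

results = {label: significance(chi2_sf_df2(stat))
           for label, stat in LRT_M8_VS_M7.items()}

for label, stat in LRT_M8_VS_M7.items():
    print(f"{label}: 2dlnL = {stat:g}, "
          f"p = {chi2_sf_df2(stat):.4g} {results[label]}")
```

Under this assumption the sketch gives the same star pattern as Table 4 for the five Arabidopsis nodes and no significant result for the three cucumber nodes.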

Discussion 

Were the CsWRKY genes underrepresented in this study?

The WRKY gene family has 72 members in Arabidopsis 
[1] and 109 members in rice [17]. In this study, we iden- 
tified a total of 55 CsWRKY genes. Compared with 






Table 3 Pearson correlation coefficients for expression profiles of orthologs*

CsWRKY     AtWRKY     Stress   Correlation coefficient
CsWRKY18   AtWRKY22   cold      0.87
CsWRKY36   AtWRKY27   cold      0.81
CsWRKY33   AtWRKY7    cold      0.77
CsWRKY2    AtWRKY33   salt      0.75
CsWRKY14   AtWRKY15   dry       0.74
CsWRKY42   AtWRKY57   salt      0.70
CsWRKY21   AtWRKY40   cold      0.67
CsWRKY55   AtWRKY23   cold      0.66
CsWRKY2    AtWRKY33   dry       0.62
CsWRKY57   AtWRKY48   dry       0.61
CsWRKY25   AtWRKY11   cold      0.60
CsWRKY4    AtWRKY32   cold      0.52
CsWRKY57   AtWRKY48   cold      0.45
CsWRKY40   AtWRKY48   dry       0.40
CsWRKY21   AtWRKY40   dry       0.34
CsWRKY46   AtWRKY28   dry       0.14
CsWRKY40   AtWRKY48   cold      0.01
CsWRKY2    AtWRKY33   cold     -0.08
CsWRKY25   AtWRKY17   cold     -0.09
CsWRKY18   AtWRKY22   dry      -0.11
CsWRKY40   AtWRKY48   salt     -0.33
CsWRKY21   AtWRKY40   salt     -0.35

Average correlation, stress-induced orthologous WRKY gene pairs    0.40
Average correlation, random genes**                                0.04

* Available expression data on AtWRKY genes from microarray analysis and data on CsWRKY genes generated by real-time PCR analysis were used to calculate the
Pearson correlation coefficient for the expression of orthologous WRKY genes under various abiotic stresses (after 0, 0.5, 1, 3, 6, 12, and 24 h of treatment) (as
shown in Figure 7), as described in the Methods.

** A randomly chosen abiotic stress-induced cucumber WRKY gene and a randomly chosen abiotic stress-induced AtWRKY gene formed a random gene pair.
This process was repeated 100 times and produced 100 random WRKY gene pairs. The expression correlation of each of the 100 random WRKY gene pairs was
calculated as described in the Methods.



Arabidopsis (genome size 125 Mb) and rice (genome size
480 Mb), cucumber (genome size 367 Mb) has a small WRKY
family. We further compared the number of WRKY genes in
the different subgroups among Arabidopsis, rice, grape and
cucumber (Table 5). As shown in Table 5, the key
difference is that the number of group 3 CsWRKY genes (6)
is much smaller than those of Arabidopsis (14) and rice
(36). This raises the question of whether CsWRKY genes,
especially group 3 CsWRKY genes, are underrepresented in
our study.
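
The family sizes and genome sizes quoted above can be put on a common scale with a small Python sketch. The per-100-Mb density computed here is only an illustrative metric introduced for this example and is not a measure used in the study; the dictionary layout and the `genes_per_100mb` helper are likewise hypothetical. The numbers themselves are the ones given in the text.

```python
# WRKY family sizes and genome sizes as quoted in the text.
FAMILIES = {
    "Arabidopsis": {"wrky_genes": 72, "genome_mb": 125},
    "rice": {"wrky_genes": 109, "genome_mb": 480},
    "cucumber": {"wrky_genes": 55, "genome_mb": 367},
}


def genes_per_100mb(entry):
    """WRKY genes per 100 Mb of genome (illustrative metric only)."""
    return 100.0 * entry["wrky_genes"] / entry["genome_mb"]


densities = {species: genes_per_100mb(entry)
             for species, entry in FAMILIES.items()}

# Print species from the highest to the lowest density.
for species, density in sorted(densities.items(),
                               key=lambda kv: kv[1], reverse=True):
    print(f"{species:12s} {density:5.1f} WRKY genes per 100 Mb")
```

On these figures cucumber has the lowest density of the three species, which is the observation that motivates the question of underrepresentation.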

Complete and accurate annotation of genes is an essential
starting point for further evolutionary and functional
studies of a gene family. We identified a total of 55
CsWRKY genes among the 26682 annotated genes in the
cucumber genome. In addition, a total of 357882 cucumber
EST sequences downloaded from the Cucumber Genome DataBase
and NCBI were used to test whether these ESTs encode
WRKY proteins that had been missed in our annotation of
CsWRKY proteins. The amino acid sequences of the open
reading frames (ORFs) of the ESTs were subjected to an HMM
program search. The results were screened manually for
false positives at E values above 10 100. Even with this
weak criterion, we failed to find any new WRKY proteins in
the cucumber genome, which indicates that the annotation
of cucumber WRKY genes is complete. We further used
experimental methods to test the accuracy of the
annotation of CsWRKY genes. Based on the annotated WRKY
gene sequences, we detected the expression of 48 CsWRKY
genes (87%), indicating that the accuracy of the
annotation of CsWRKY genes is high. Moreover, we cloned
and sequenced full-length cDNAs of 32 of the annotated
CsWRKY genes (Table 1), and some annotation errors were
corrected. For example, we found that the predicted
CsWRKY15 and CsWRKY16 were actually two domains of one
WRKY protein. Through this process, the integrity and
accuracy of the annotated CsWRKY genes were improved and
were high enough for use in our further study. Therefore,
we believe that CsWRKY genes are not underrepresented in
our study.






[Figure 7 panel labels: cold stress, R = 0.87; salt stress, R = 0.75; cold stress, R = 0.81; CsWRKY50 and AtWRKY46, dry stress, R = 0.03; CsWRKY21 and AtWRKY40, salt stress, R = -0.35. x-axis ticks: 0, 0.5, 1, 3, 6, 12, 24.]

Figure 7 Pairwise comparisons of the expression profiles of putative orthologous cucumber and Arabidopsis WRKY genes under 
abiotic stresses. The relative expression of CsWRKY genes was obtained by real-time RT-PCR (indicated by triangles). Data are the means of 
three replicates with standard errors represented by bars. The CsWRKY expression data were compared with the mean-normalized expression 
data for their putative orthologous AtWRKY genes from a publicly available Arabidopsis microarray data set (indicated by circles) according to 
the description in Methods. The relative amount of mRNA (y-axis) was the ratio of treated to untreated sample. The treatment time (h) under 
the particular abiotic stress is presented on the x-axis. R indicates the correlation coefficient for expression between orthologs under the 
corresponding abiotic stresses. A distinct positive correlation was detected in most orthologs (A-D), but no obvious correlation was detected in 
group 3 orthologs (E-F). A negative correlation was detected in a small number of orthologs (G-H). 



The rapid expansion of group 3 WRKY genes is
associated with recent duplication events

Many angiosperms underwent whole-genome duplication
events (γ, β, α). The γ event appears to pre-date the
monocot-dicot divergence. The β event pre-dated the
divergence of Arabidopsis from the other dicots, but
post-dated its divergence from the monocots about 170-235
Myr ago. The α duplication event (the recent duplication
event) pre-dated the divergence of Arabidopsis from
Brassica about 14.5-20.4 million years (Myr) ago [45].
Recent gene duplication events are the most important in
the rapid expansion and evolution of gene families [46].
Therefore, in this study, we analyzed only the influence
of recent duplication events on CsWRKY genes.

Both the Arabidopsis and rice genomes underwent recent
duplication events, which led to the large-scale expansion
of gene families in their genomes [46,47]. Zhang et al.
reported that group 3 WRKY domains appear to have been
duplicated independently after the divergence of monocots
and dicots (160 Mya) [44]. In this study, we further
examined the duplication of group 3 WRKY genes after the
divergence of the eurosids I group and the





[Figure 8: phylogram image. Tip labels carrying * in the extracted figure text: AtWRKY30, AtWRKY41, AtWRKY53, AtWRKY46, AtWRKY55, AtWRKY70 and AtWRKY54.]

Figure 8 Phylogram of group 3 WRKY domains from
Arabidopsis (AtWRKY), cucumber (CsWRKY), poplar (PtWRKY)
and soybean (GmWRKY). The phylogenetic tree was constructed
using the neighbor-joining method as implemented in PHYLIP 3.2.
Numbers on internal nodes are the percentage bootstrap support
values (1000 re-samplings); only values exceeding 50% are shown.
The most primitive Giardia lamblia WRKY C-terminal domain
(GlWRKY1C) was used as an outgroup. The letters T and S indicate
nodes where tandem duplication and recent segmental duplication
events have occurred, respectively. * indicates the AtWRKY genes
associated with the gene duplication events.



eurosids II group (110 Mya). As shown in Figure 8, the
closely related paralogous WRKY genes of Arabidopsis,
poplar and soybean each clustered together, indicating
that the expansion of the group 3 WRKY gene family may
have occurred after the divergence of the eurosids I and
eurosids II (110 Mya), and should be related to the most
recent genome duplication events (24-40 Mya). Moreover,
our results indicated that one important



[Figure 9 tip labels recovered from the image. A: AtWRKY53, AtWRKY30, AtWRKY55, AtWRKY46, AtWRKY52, AtWRKY54, AtWRKY70, AtWRKY62, AtWRKY38, AtWRKY67, AtWRKY66, AtWRKY64, AtWRKY63. B: CsWRKY31, CsWRKY20, CsWRKY35, CsWRKY22.]

Figure 9 Phylogram of group 3 WRKY genes of Arabidopsis
and cucumber. The phylograms were constructed using the
neighbor-joining method as implemented in PHYLIP 3.2. Numbers
on the left of each internal node represent bootstrap support values
(1000 re-samplings); only values exceeding 50% are shown. Numbers
on the right of each node identify the nodes that were used for
positive selection analysis. Arabidopsis AtWRKY1 was used as an
outgroup. The trees represent phylogenetic relationships among (A)
AtWRKY proteins and (B) CsWRKY proteins.



factor in the expansion of group 3 AtWRKY genes was the
occurrence of tandem duplication events. Four tandemly
duplicated genes clustered together in the phylogenetic
tree, indicating that the tandem duplications occurred
after the divergence of the eurosids I and eurosids II and
are also related to recent duplication events.
Interestingly, although tandem duplication was an
important recent gene duplication pattern in the
Arabidopsis genome [46], only four genes in the AtWRKY
family came from tandem duplication blocks, and all of
them belonged to group 3. Thus, the group 3 AtWRKY genes
expanded quickly in the Arabidopsis genome through two
duplication patterns, recent segmental duplication and
recent tandem duplication, which indicates that group 3
WRKY genes may play important roles in the adaptability of
angiosperms.

As far as cucumber is concerned, Huang et al. reported
that the cucumber genome lacks recent whole-genome
duplication events and tandem duplications [40].
Nevertheless, the method of Schauser [43] was used to
detect whether recent small duplication blocks occur in
the CsWRKY family. We found no CsWRKY gene loci on any
recent duplication blocks (additional file 2). In
addition, Figure 1 shows that there are






Table 4 Likelihood ratio test results of group 3 AtWRKY and CsWRKY.

Group 3 AtWRKY

Node (a)  dN/dS M0 (b)  2ΔlnL M3 vs. M0  2ΔlnL M8 vs. M7  M8 estimates (c)                   No. of positive selection sites (d)
1         0.5712        170.69**         17.76**          ω = 2.78, β(p = 0.76, q = 1.17)    7
2         0.5689        36.21**          6.92*            ω = 4.68, β(p = 0.39, q = 0.34)    10
3         0.3248        141.78**         8.37*            ω = 32.95, β(p = 0.37, q = 0.26)   5
4         0.6485        54.62**          9.97**           ω = 77.65, β(p = 0.66, q = 1.05)   11
5         0.2682        169.06**         10.66**          ω = 3.32, β(p = 0.72, q = 0.78)    9

Group 3 CsWRKY

Node      dN/dS M0      2ΔlnL M3 vs. M0  2ΔlnL M8 vs. M7  M8 estimates                       No. of positive selection sites
1         0.3331        37.31**          1.40e-05         ω = 4.28, β(p = 0.85, q = 1.39)    0
2         0.3623        83.01**          8.80e-05         ω = 1.00, β(p = 0.72, q = 1.143)   0
3         0.3081        186.07**         2.99e-05         ω = 24.88, β(p = 0.60, q = 0.55)   0

Note: * p < 0.05 and ** p < 0.01 (χ² test)
(a) Node number from the phylogenetic tree
(b) dN/dS is the average ratio over sites under a codon model with one ratio
(c) ω was estimated under model M8; p and q are the parameters of the beta distribution
(d) The number of amino acid sites estimated to have undergone positive selection under M8



no tandemly arrayed WRKY genes at the same chromosomal
location, which indicates the absence of recent tandem
duplication events in CsWRKY genes. Therefore, compared
with Arabidopsis and rice, the small number of group 3
CsWRKY proteins can be attributed to the absence of recent
duplication events in the cucumber genome. To test this
hypothesis, we searched for grape WRKY proteins (VvWRKY)
in the grape genome. The grape genome, like that of
cucumber, has not undergone recent duplication events
[48]. As shown in Table 5, only five group 3 VvWRKY
proteins (GSVIVT01028718001, GSVIVT01019511001,
GSVIVT01027069001, GSVIVT01032662001 and
GSVIVT01032661001) could be detected in the grape genome.
Therefore, on the basis of the above discussion, we
believe that, compared with Arabidopsis and rice, the
small number of group 3 CsWRKY genes can be attributed to
the absence of recent duplication events in the cucumber
genome rather than to the underrepresentation of group 3
CsWRKY genes in our study.



CsWRKY proteins play important roles in various 
biological processes 

The previously reported WRKY gene of cucumber (SE71, ID:
AAC37515.1) shares 93% similarity with the CsWRKY37
reported here. The expression of SE71 increases in
cotyledons as they expand and become photosynthetic,
suggesting an involvement of SE71 in the development of
cotyledons and in cucumber photosynthesis [7]. Our RT-PCR
results showed that CsWRKY37 was expressed in all seven
cucumber tissues at relatively high levels, which
indicates that CsWRKY37 could play a role not only in the
development of cotyledons and photosynthesis but also in
processes such as flower formation and fruit development.
Besides CsWRKY37, some other CsWRKY genes, such as
CsWRKY25 and CsWRKY49, also showed relatively high
expression levels in all seven organs. WRKY genes that are
highly expressed in plant organs often play key roles in
plant development [18]. The role of WRKY genes in plant
development lies in the transcriptional regulation


Table 5 The number of WRKY in cucumber, Arabidopsis, grape and rice

          Group1  Group2a  Group2b  Group2c  Group2d  Group2e  Group3
CsWRKY    10      4        4        16       8        7        6
AtWRKY    13      4        7        18       7        9        14
VvWRKY*   12      4        7        14       6        7        5
OsWRKY    15      4        8        15       7        11       36

Note: * the WRKY proteins of grape (Vitis vinifera)






of the expression of target genes that are involved in
physiological pathways [3]. We therefore speculate that
the highly expressed CsWRKY genes reported here may play a
regulatory role in cucumber development. However, more
research is needed to determine the functions of the
CsWRKY genes.

Evidence is accumulating that WRKY proteins are involved
in responses to various abiotic stresses. At least 54
OsWRKY genes of rice and 26 GmWRKY genes of soybean were
found to be differentially expressed under abiotic
stresses [18]. In this study, we showed that 23 CsWRKY
genes exhibited differential expression in response to at
least one abiotic stress, indicating that CsWRKY genes may
play an important role in the response of cucumber to
abiotic stresses. Previous studies indicated that some
WRKY proteins are themselves stable and resistant to
environmental stresses. Huang et al. reported that a WRKY
gene of bittersweet nightshade (STHP-64) encodes an
antifreeze protein containing a unique 13-mer repeat in
the C-terminus, which is known to be a common feature of
animal antifreeze proteins [9]. However, an increasing
number of studies indicate that WRKY proteins are
transcription factors that regulate the tolerance of
plants to abiotic stresses [38]. As shown in Figure 6,
some of the CsWRKY genes responded to stresses at an early
stage. For example, CsWRKY18 peaked at 0.5 h after drought
treatment. These results indicate that some CsWRKY genes
may act as transcription factors that regulate the
tolerance of cucumber to stresses. To understand the
biological functions of WRKY transcription factors, the
identification of their target genes and regulatory
networks is necessary. Expression of the soybean GmWRKY54
in transgenic Arabidopsis showed that GmWRKY54 can
regulate the expression of DREB2A, which contains a W-box
motif in its promoter region and is known to act as a
transcription factor regulating the expression of many
drought-inducible genes [15]. Other recent studies have
revealed that two co-regulated networks in rice regulate
the response to various abiotic stresses [49]. These
results indicate that the regulatory role of WRKY proteins
under abiotic stresses is complex and that more work is
needed to understand the regulatory mechanisms.

Functional conservation and divergence of
orthologous genes between Arabidopsis and cucumber

In comparative genomics, the clustering of orthologous
genes highlights the divergence and conservation of gene
families among multiple genomes. Two strategies have often
been used to identify orthologs or paralogs:
phylogeny-based methods and BLAST-based methods [50].
Phylogeny-based methods recover a wide range of
orthologous pairs but may produce false positives [51];
strict criteria must therefore be adopted when using them.
The BLAST-based method (bidirectional best hit, BBH) shows
good overall performance but is restricted to 1:1
orthologs, so in-paralogs may be omitted [51]. In this
study, a rooted phylogenetic tree based on the WRKY
domains of rice, cucumber and Arabidopsis was used to
assign possible orthologs of cucumber and Arabidopsis. In
addition, the standard BBH approach was used as a
reference for assigning possible orthologs. Relatively
strict criteria were used to assign orthologous genes in
this study. Only nodes of the phylogenetic tree with
bootstrap support values (1000 re-samplings) exceeding 50%
were used to identify possible orthologous pairs. For
example, AtWRKY65 and CsWRKY6 clustered together in the
phylogenetic tree, but the bootstrap support of their node
did not exceed 50%. Therefore, AtWRKY65 and CsWRKY6 were
excluded from the orthologous pairs, as were CsWRKY11 and
AtWRKY18/60. In addition, members of group 1 were
considered possible orthologous pairs only if the same
phylogenetic relationship was detected for both their
N-terminal and C-terminal domains in the phylogenetic
tree. For example, CsWRKY8 and AtWRKY25/26 were excluded
from the orthologous pairs because their N-domains and
C-domains clustered differently in the phylogenetic tree.
In total, we found 38 orthologous pairs between cucumber
and Arabidopsis (additional file 2).
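
The bidirectional best hit criterion and the bootstrap cut-off described above can be sketched in Python as follows. This is a simplified illustration and not the pipeline used in the study: the gene names (`CsA`, `AtX`, and so on), the similarity scores and the bootstrap values are all hypothetical, and the functions `best_hits`, `bidirectional_best_hits` and `filter_by_bootstrap` are names invented for this example. In the study the phylogenetic tree was the primary evidence and BBH served as a reference.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) to a similarity score.
    When two subjects tie, the one encountered first is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def filter_by_bootstrap(pairs, support, threshold=50):
    """Keep pairs whose tree node has bootstrap support above threshold."""
    return [pair for pair in pairs if support.get(pair, 0) > threshold]


# Hypothetical similarity scores (illustrative only).
cs_vs_at = {("CsA", "AtX"): 310.0, ("CsA", "AtY"): 120.0,
            ("CsB", "AtY"): 200.0, ("CsC", "AtX"): 150.0}
at_vs_cs = {("AtX", "CsA"): 305.0, ("AtX", "CsC"): 140.0,
            ("AtY", "CsB"): 198.0, ("AtY", "CsA"): 118.0}

# Hypothetical bootstrap support (%) of the node joining each pair.
bootstrap = {("CsA", "AtX"): 92, ("CsB", "AtY"): 41}

bbh_pairs = bidirectional_best_hits(cs_vs_at, at_vs_cs)
kept_pairs = filter_by_bootstrap(bbh_pairs, bootstrap)

print("BBH pairs:", bbh_pairs)
print("pairs with bootstrap support > 50%:", kept_pairs)
```

In this toy data set, `CsC` finds `AtX` as its best hit but is not the best hit of `AtX`, so it is dropped at the BBH step, which mirrors the restriction to 1:1 orthologs noted in the text.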

We further analyzed the correlation of orthologous pairs
under abiotic stresses. Our results show correlated
expression profiles for stress-inducible orthologous WRKY
genes between cucumber and Arabidopsis. Mangelsen et al.
reported that, in homologous organs, the average
correlation coefficient of orthologous WRKY genes between
monocots and dicots can reach 0.24 [52]. Because research
on the role played by cucumber genes in abiotic stress
tolerance is quite limited, our study provides a new
starting point for investigating the function of cucumber
genes by comparing orthologous genes between cucumber and
Arabidopsis. Furthermore, in our study, orthologous WRKY
genes with different evolutionary patterns displayed a low
correlation in their expression patterns. Almost half of
the CsWRKY genes in our study responded to at least one
abiotic stress, but none of them belongs to group 3. In
contrast, microarray expression data for AtWRKY genes
revealed that all the genes orthologous to group 3 CsWRKY
genes respond to abiotic stresses in Arabidopsis and,
interestingly, all of them are located in recent
segmentally duplicated regions. Recent segmental
duplication occurs frequently in plants because most
plants are diploidized polyploids and retain numerous
duplicated chromosomal blocks in





their genomes [53]. As discussed earlier in this paper,
after the divergence of eurosids I and eurosids II, the
group 3 AtWRKY genes experienced segmental duplication
events. The long-term evolutionary fate of duplicated
genes is determined by their functions. Four outcomes may
follow gene duplication: pseudogenization, conservation of
gene function, subfunctionalization and
neofunctionalization [54]. Many duplicated genes may be
lost from the genome after duplication events, and
neofunctionalization and subfunctionalization are the
major factors in the retention of new genes. In addition,
positive selection may play important roles in the
neofunctionalization and subfunctionalization of
duplicated genes. In the case of neofunctionalization,
positive selection accelerates the fixation of
advantageous mutations that enhance the activity of the
novel function. In the case of subfunctionalization, each
daughter gene inherits one of the functions of the
ancestral gene, and further substitutions under positive
selection can refine these functions [47]. In Arabidopsis,
the number of group 3 WRKY genes increased significantly
due to the duplication events after the divergence of the
eurosids I and eurosids II, and our results suggest that
all duplicated group 3 AtWRKY genes experienced positive
selection after their duplication. The retention of new
members of group 3 AtWRKY could be attributed to their
neofunctionalization. In rice, high expression divergence
could be one of the mechanisms for the retention of
duplicated WRKY genes [18]. Due to the lack of gene
duplication events in the CsWRKY family, the functions of
group 3 CsWRKY genes are probably more conserved than
those of AtWRKY genes. The functions of the group 3 CsWRKY
genes likely resemble those of a common ancestor that
existed before the divergence of eurosids I and II.
Indeed, the common ancestor may not have been responsive
to abiotic stresses, and the stress responsiveness of the
group 3 AtWRKY genes could be due to neofunctionalization
following gene duplication event(s).

Conclusions 

In this study, we identified a total of 55 cucumber WRKY
genes and analyzed the expression profiles of 48 CsWRKY
genes under normal growth conditions and in response to
various abiotic stresses. The new WRKY sequences and
expression information reported here will be useful for
further investigating the function of WRKY genes under
various stress conditions. Although the genome sequence of
cucumber has been reported, functional studies on cucumber
genes still lag behind. Our results show that the expression
profiles of putative WRKY orthologs of cucumber and
Arabidopsis are correlated. Hence, comparative genomics
approaches could be used to investigate gene function. In
addition, compared with group 1 and 2 WRKY genes, the group
3 WRKY genes seem to have arisen more recently in
angiosperms, but have expanded rapidly. Our results also
indicate that positive selection could have led to the
functional divergence of duplicated genes during the
expansion of group 3 WRKY genes. Based on the results
presented here, we speculate that the functional divergence
of WRKY proteins has played a critical role in the responses
of plants to various stresses.

Methods 

Sequence database searches 

Arabidopsis WRKY protein sequences were obtained from TAIR
[55]. The rice WRKY protein sequences were obtained from the
Rice Genome Annotation Project [56]. The WRKY proteins of
poplar and soybean were obtained from the PFAM database
[57]. The GenBank accession numbers of the WRKY protein
sequences are provided in additional file 7. The WRKY
proteins of grape were obtained from
http://www.genoscope.cns.fr/externe/Download/Projets/Projet_ML/data/12X/annotation/Vitis_vinifera_peptide.fa.gz.

The annotated (predicted) cucumber genes and proteins were
obtained from the Cucumber Genome Sequencing Project, in
which we participated. These annotation data can now be
downloaded from the Cucumber Genome DataBase [58]. We
searched for WRKY proteins among a total of 26682 predicted
cucumber proteins, using 72 Arabidopsis WRKY proteins as
query sequences in BLASTP searches against the predicted
cucumber proteins. Sequences were selected as candidate
proteins if their E-value was below 1e-10. Following the
HMMER User's Guide, the hmmsearch program was then used to
predict the WRKY domains (PF03106.7) of all candidate
proteins, with the E-value threshold set to 1e-10. The new
WRKY-like sequences confirmed by hmmsearch in the cucumber
genome were in turn used iteratively to search the predicted
cucumber proteins until no new sequences were found. The EST
sequences of cucumber were downloaded from NCBI and the
Cucumber Genome DataBase [58].
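
The candidate selection and the reiterative search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it assumes the BLASTP results are in the standard 12-column tabular format (subject ID in column 2, E-value in column 11), it reads the cutoff as 1e-10, and the function names, the sample rows and the `search` callable (standing in for a BLASTP plus hmmsearch round) are hypothetical.

```python
def candidate_subjects(blast_lines, max_evalue=1e-10):
    """Collect subject IDs having at least one hit with E-value below max_evalue.

    Assumes BLAST tabular output: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    candidates = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue < max_evalue:
            candidates.add(subject_id)
    return candidates


def iterate_until_stable(seed_ids, search):
    """Repeat the search with every newly confirmed sequence until none are new.

    `search` is a caller-supplied function (hypothetical here) that takes one
    sequence ID and returns the IDs of confirmed WRKY-like hits.
    """
    found = set(seed_ids)
    frontier = set(seed_ids)
    while frontier:
        new_hits = set()
        for query in frontier:
            new_hits |= set(search(query))
        frontier = new_hits - found  # only sequences not seen before
        found |= frontier
    return found


# Hypothetical rows; the identifiers are invented for illustration only.
sample_rows = [
    "AtWRKY_q1\tCs_pred_1\t62.5\t60\t20\t1\t100\t160\t90\t150\t3e-25\t110",
    "AtWRKY_q1\tCs_pred_2\t30.0\t40\t28\t0\t10\t50\t5\t45\t0.002\t35.1",
    "AtWRKY_q2\tCs_pred_1\t58.0\t60\t25\t0\t200\t260\t90\t150\t1e-18\t88.0",
]
print(sorted(candidate_subjects(sample_rows)))
```

Keeping the search itself behind a callable separates the stopping rule (no new sequences found) from the choice of search tool.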

Multiple sequence alignment, gene structure construction 
and phylogenetic analysis 

The 60-amino-acid WRKY core domains of all CsWRKY proteins
and of selected AtWRKY proteins (AtWRKY20 (At4g26640), 40
(At1g80840), 72 (At5g15130), 50 (At5g26170), 74 (At5g28650),
65 (At1g29280) and 54 (At2g40750)) were used to create
multiple protein sequence alignments using ClustalW [59].
Default settings were applied for the alignment in Figure 2.
The gene structure was obtained from the cucumber gene
annotation GFF3 file downloaded from the Cucumber Genome
DataBase. The neighbor-joining method was used to construct
the phylogenetic tree based on the amino acid sequences of
the WRKY domains. Two software packages, MEGA 4.0 and PHYLIP
3.2, were used [60,61]. The MEGA 4.0 analysis was carried
out as described by Zhang et al. [62] and the PHYLIP 3.2
analysis as described by Zhou et al. [15]. Motif detection
was performed with MEME 4.0 software [63]. A rooted
phylogenetic tree based on the WRKY domains of rice,
cucumber and Arabidopsis was used to assign possible
orthologs of cucumber and Arabidopsis. In addition, the
standard bidirectional best hit (BBH) approach was used as a
reference for assigning possible orthologs [51,64].
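
The bidirectional best hit criterion mentioned above can be sketched as follows: two genes are paired only if each is the other's top-scoring hit. This is a minimal illustration that assumes pairwise similarity scores (for example BLAST bit scores) are already available as dictionaries; the function names and gene identifiers are hypothetical and are not taken from the study.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    `scores` maps (query, subject) tuples to a similarity score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical scores for two cucumber and two Arabidopsis proteins.
cs_vs_at = {
    ("Cs_gene_A", "At_gene_X"): 200.0,
    ("Cs_gene_A", "At_gene_Y"): 150.0,
    ("Cs_gene_B", "At_gene_X"): 120.0,
    ("Cs_gene_B", "At_gene_Y"): 90.0,
}
at_vs_cs = {
    ("At_gene_X", "Cs_gene_A"): 210.0,
    ("At_gene_X", "Cs_gene_B"): 100.0,
    ("At_gene_Y", "Cs_gene_A"): 140.0,
    ("At_gene_Y", "Cs_gene_B"): 95.0,
}
print(bidirectional_best_hits(cs_vs_at, at_vs_cs))
```

In this toy input only one pair is reciprocal, which shows why BBH is conservative: a gene whose best hit prefers another partner is left unpaired.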

Microarray based expression analysis and correlation 
calculation 

For the expression analysis of AtWRKY genes, publicly
available microarray data from the AtGenExpress global
stress expression data set [37] were used. The microarray
data for cold stress (ME00325), drought stress (ME00338) and
salt stress (ME00328) were downloaded from the Weigel World
database [65]. The mean-normalized values of the expression
data were used in further analysis. The relative amount of
mRNA was calculated by dividing the expression value under
stress treatment by that of the control (0 h treatment).
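
As a small worked example of this normalization, the sketch below divides each value of a time series by its 0 h value. The function name and the numbers are hypothetical; only the seven time points and the division by the 0 h control come from the text.

```python
# Treatment time points (hours) used throughout the study.
TIME_POINTS_H = (0, 0.5, 1, 3, 6, 12, 24)


def relative_to_control(expression_by_time):
    """Divide every value by the first one (the 0 h control)."""
    control = expression_by_time[0]
    if control == 0:
        raise ValueError("control expression must be non-zero")
    return [value / control for value in expression_by_time]


# Hypothetical mean-normalized signal values, one per time point.
signal = [8.0, 8.0, 12.0, 16.0, 32.0, 24.0, 4.0]
relative = relative_to_control(signal)
print(dict(zip(TIME_POINTS_H, relative)))
```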

Expression data for AtWRKY genes from microarray analysis
and for CsWRKY genes generated by the real-time RT-PCR
analysis described here were used to calculate the Pearson
correlation of the expression of orthologous WRKY genes. All
expression data (relative amount of mRNA) comprise seven
time points (0, 0.5, 1, 3, 6, 12, and 24 h) under the
corresponding abiotic stress. For each orthologous WRKY gene
pair, the correlation of the expression data under the
corresponding abiotic stress was calculated. The
significance of the correlation between orthologs was tested
as follows. A randomly chosen abiotic stress-induced
cucumber WRKY gene and a randomly chosen abiotic
stress-induced AtWRKY gene constituted a random WRKY gene
pair. This process was repeated 100 times to produce 100
random WRKY gene pairs. The expression correlation of each
of the 100 random WRKY gene pairs was calculated as
described above. Lastly, the average correlation of the
orthologous WRKY gene pairs and that of the randomly
selected gene pairs were calculated. Student's t-test was
used to assess the statistical significance of the
difference in average correlation between the two datasets.
The random WRKY gene pairs were generated using Perl
scripts. Pearson correlations and P-values of the t-test
were calculated using R. All programs were run on a computer
with Ubuntu Linux installed.
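
The comparison of orthologous and random gene pairs can be sketched in Python as follows. The study used Perl scripts and R, so this is a re-expression under stated assumptions rather than the original code: the function names, the seed and the demonstration profiles are hypothetical, and the sketch stops at the pooled-variance t statistic (a P-value would additionally require the t distribution, which R supplies).

```python
import math
import random
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equally long profiles (neither may be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def random_pair_correlations(cs_profiles, at_profiles, n_pairs=100, seed=0):
    """Correlate n_pairs randomly drawn (cucumber, Arabidopsis) gene pairs."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    cs_ids = sorted(cs_profiles)
    at_ids = sorted(at_profiles)
    return [
        pearson(cs_profiles[rng.choice(cs_ids)], at_profiles[rng.choice(at_ids)])
        for _ in range(n_pairs)
    ]


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical seven-point relative expression profiles (0 to 24 h).
cs_demo = {
    "Cs_gene_A": [1.0, 1.2, 2.0, 4.0, 6.0, 3.0, 1.5],
    "Cs_gene_B": [1.0, 0.9, 0.8, 0.5, 0.6, 0.7, 1.1],
}
at_demo = {
    "At_gene_X": [1.0, 1.1, 1.8, 3.5, 5.0, 2.5, 1.2],
    "At_gene_Y": [1.0, 1.0, 1.2, 1.1, 0.9, 1.3, 1.0],
}
random_r = random_pair_correlations(cs_demo, at_demo)
print(len(random_r), round(mean(random_r), 3))
```

The list of orthologous-pair correlations and `random_r` would then be passed to `student_t` to compare the two averages.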



Detection of positive selection 

The amino acid sequences of group 3 AtWRKY and CsWRKY
proteins were used to construct separate phylogenetic trees,
which in turn were used for detecting positive selection. We
used PAML4 [66] to analyze codon substitution patterns by
maximum likelihood, implementing site-specific models. We
detected variation in ω values among sites by employing
likelihood ratio tests (LRTs) of M0 vs. M3 and M7 vs. M8
according to Yang et al. [67]. Nodes were considered to have
undergone positive selection if they satisfied the following
criteria: (1) an estimate of ω > 1 under M8; (2) sites
identified to be under positive selection by Bayes Empirical
Bayes (BEB) analysis; and (3) a statistically significant
LRT.
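
The likelihood ratio test and the three criteria can be illustrated with the short sketch below. It is not PAML itself: the log-likelihood values are hypothetical, the degrees of freedom (2 for M7 vs. M8, and 4 for M0 vs. M3 with three site classes) are the values commonly used for these comparisons and are an assumption here, and the 0.05 significance level is likewise assumed, since the text does not state one. For even degrees of freedom the chi-square tail probability has a closed form, which keeps the example free of external libraries.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of the chi-square distribution for even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def supports_positive_selection(omega_m8, beb_sites, p_value, alpha=0.05):
    """Apply the three criteria: omega > 1 under M8, BEB sites, significant LRT."""
    return omega_m8 > 1.0 and len(beb_sites) > 0 and p_value < alpha


# Hypothetical log-likelihoods for the M7 (null) and M8 (alternative) models.
stat = lrt_statistic(lnl_null=-2500.0, lnl_alt=-2492.0)
p_m7_m8 = chi2_sf_even_df(stat, df=2)
print(stat, p_m7_m8)
print(supports_positive_selection(omega_m8=2.3, beb_sites=[45, 112], p_value=p_m7_m8))
```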

Plant materials, growth conditions and treatments 

Line 9930, a cucumber typical of northern China, was used
throughout the study. Seeds were germinated in pots
containing vermiculite, and 3-week-old seedlings were used
in the following treatments. For dehydration treatment, the
plants were carefully pulled out, transferred onto filter
paper and allowed to dry. For salinity and cold treatments,
seedlings were subjected to a 100 mM NaCl solution or
incubated at 4°C, respectively. Above-ground samples for RNA
extraction were collected at 0, 0.5, 1, 3, 6, 12 and 24 h
after treatment. The roots, stems, leaves and cotyledons of
seedlings, and the female flowers, male flowers and fruits
of mature plants, were collected separately for RNA
isolation and used for tissue-specific expression analysis.

RNA isolation, full-length cDNA cloning, RT-PCR and
real-time PCR analysis

Total RNA was isolated according to Zhang et al. [59]. To
clone the full-length cDNA of CsWRKY genes, we first used
the cucumber EST sequences to correct the annotated CsWRKY
sequences and then used Fgenesh, a web-based gene prediction
tool, to re-annotate all 57 WRKY genes. Subsequently,
combining the results of Fgenesh, GLEAN and EVM (GLEAN and
EVM were employed to annotate the cucumber genome in the
cucumber genome project), we amplified the full-length
coding sequences (CDS) of the CsWRKY genes by PCR.

For RT-PCR, specific primers were designed according to the
WRKY gene sequences using Primer 5 software (additional file
8). A cucumber β-actin gene (ID: Csa017310), amplified with
primers 5'-TCCACGAGACTACCTACAACTC-3' and
5'-GCTCATACGGTCAGCGAT-3', was used as a control. The
following program was used for RT-PCR: 94°C for 2 min,
followed by 35 cycles of 94°C for 10 s, 55-59°C for 10 s and
72°C for 25 s, followed by a 2 min extension step at 72°C.
The number of PCR cycles for the actin gene was set to 23.
The PCR products were separated on an agarose gel and
quantified using an Imaging System (Bio-Rad, USA). The
experiments were repeated three times with independent RNA
samples.

The real-time PCR analysis was performed using a Bio-Rad
CFX96 Real-Time PCR system (Bio-Rad, USA) in 96-well format,
with denaturation at 95°C for 3 min, followed by 40 cycles
of denaturation at 95°C for 10 s and annealing/extension at
55 or 60°C for 1 min. Three biological replicates were
carried out, and triplicate quantitative assays for each
replicate were performed on 0.5 μl of each cDNA dilution
using the TianGen SYBR Green PCR Master mix kit (TianGen
Biotech FP202, CHN) according to the manufacturer's
protocol. The cucumber β-actin gene was used as an internal
control. Relative gene expression was calculated according
to Jiang et al. [68]. ΔCT and ΔΔCT were calculated by the
formulas $\Delta C_T = C_{T,\mathrm{target}} - C_{T,\mathrm{reference}}$
and $\Delta\Delta C_T = \Delta C_{T,\mathrm{treated}} - \Delta C_{T,\mathrm{untreated}}$,
where the untreated sample is the 0 h treatment. The
relative amount of RNA, $2^{-\Delta\Delta C_T}$, was used to
evaluate the gene expression level and for all chart
preparations. The standard errors of the mean among
replicates were also calculated. All calculations were
carried out automatically in Bio-Rad CFX Manager (Version
1.5.534) of the Bio-Rad CFX96. Student's t-test was used to
assess the statistical significance of the difference
between treated samples and untreated samples (0 h treatment
under abiotic stress). WRKY genes with P-values < 0.01 were
considered differentially expressed. The specific primers
designed for the WRKY genes and the β-actin gene used in
real-time PCR are listed in additional file 9. The data and
pictures produced by the Bio-Rad CFX96 are presented in
additional file 10 and additional file 11, respectively.
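
The relative quantification described above can be worked through in a few lines. This sketch only restates the formulas in code; the function names and CT values are hypothetical, and the study itself performed these calculations in Bio-Rad CFX Manager.

```python
def delta_ct(ct_target, ct_reference):
    """Delta CT: target gene CT minus reference (actin) CT."""
    return ct_target - ct_reference


def relative_amount(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative RNA amount 2^(-delta-delta CT) of a treated sample versus the 0 h control."""
    dd_ct = delta_ct(ct_target_treated, ct_ref_treated) - delta_ct(
        ct_target_control, ct_ref_control
    )
    return 2.0 ** (-dd_ct)


# Hypothetical CT values: the target gene crosses threshold two cycles
# earlier after treatment while the actin reference is unchanged.
fold = relative_amount(
    ct_target_treated=24.0,
    ct_ref_treated=18.0,
    ct_target_control=26.0,
    ct_ref_control=18.0,
)
print(fold)  # 4.0
```

Because CT is measured in cycles and each cycle roughly doubles the product, a ΔΔCT of -2 corresponds to a four-fold increase relative to the 0 h sample.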

Additional material 



Additional file 4: The schematic diagram of motifs of WRKY
proteins. The schematic diagram was derived from MEME 4.0 software.
The order of motifs of WRKY proteins in the diagram was automatically
generated by MEME software according to scores.

Additional file 5: Comparison of expression pattern of orthologous 
WRKY pairs under various abiotic stresses. Available expression data 
on AtWRKY genes from microarray analysis and that of CsWRKY genes 
generated by real-time PCR analysis were compared. 

Additional file 6: The expression data for calculating the correlation
of orthologs under abiotic stresses. Expression data of Arabidopsis
from microarray analysis and of cucumber from real-time RT-PCR analysis
were used to calculate the Pearson correlation of the expression of
orthologous WRKY gene pairs under various abiotic stresses (at 0, 0.5,
1, 3, 6, 12 and 24 h of treatment).

Additional file 7: The GenBank accession numbers of WRKY protein 
sequences used in the manuscript. GenBank accession numbers of 
WRKY protein were from NCBI or PFAM database. 

Additional file 8: The primer sequences used for RT-PCR 
amplification of 48 CsWRKY genes. The specific primers were designed 
according to the WRKY gene sequences by Primer 5 software. 

Additional file 9: The primer sequences used for real-time PCR of 
stress-responsive and group 3 CsWRKY genes. The specific primers 
were designed according to the WRKY gene sequences by Primer 5 
software. 

Additional file 10: The expression patterns of stress-inducible
CsWRKY genes shown by real-time PCR analyses under three
different abiotic stresses. The pictures in the first, second and
third columns show the expression patterns under cold treatment,
drought treatment and salt treatment, respectively. For each picture, the
y-axis indicates the relative fold change of treatment to control and the
x-axis indicates the time under treatment. (A), CsWRKY2; (B), CsWRKY18;
(C), CsWRKY21; (D), CsWRKY40; (E), CsWRKY46. These are the original
pictures produced automatically by Bio-Rad CFX Manager software.

Additional file 11: The Ct-values and standard deviation for the real 
time RT-PCR of CsWRKY genes. The Ct-value and standard deviation of 
CsWRKY genes and their corresponding actin control under different 
treatments. 



List of abbreviations 

RT-PCR: reverse transcription PCR; TF: transcription factor; WDs: WRKY 
domains; ML: Maximum likelihood; NJ: neighbor-joining; dS: the rate of 
synonymous substitutions; dN: the rate of non-synonymous substitutions. 

Acknowledgements 

This work was supported by the National Natural Science Foundation of
China (No. 31030057); the National Key Basic Research and Development
Program of China [grant no. 2009CB119000]; the National Natural Science
Foundation of China (No. 31000922); the earmarked fund for Modern Agro-
industry Technology Research System; and the Key Laboratory of
Horticultural Crop Biology and Germplasm Innovation, Ministry of
Agriculture. We also thank Dr. Zhonghua Zhang for his great technical
assistance.

Authors' contributions 

JL contributed to RNA extraction, RT-PCR, real-time PCR, bioinformatics
analysis and writing of the manuscript. YZ and ZCM helped with the RNA
extraction, RT-PCR, and real-time PCR. HJY contributed to the
discussion of the evolution pattern of WRKY genes. XFG and SWH
contributed to the discussion and calculation of positive selection of WRKY
genes. WJJ and BYX designed the experiments and contributed to revisions
of the manuscript. All authors read and approved the final manuscript.

Received: 13 May 2010 Accepted: 28 September 2011
Published: 28 September 2011



Additional file 1: A rooted phylogenetic tree representing
relationships among WRKY domains of rice, cucumber and
Arabidopsis. The amino acid sequences of the WRKY domains of rice
WRKY (OsWRKY), CsWRKY and AtWRKY proteins were used to reconstruct
a phylogenetic tree. The most primitive Giardia lamblia WRKY C-terminal
domain (GlWRKY1C) was used as an outgroup. For group 1 proteins, the
suffix 'N' or 'C' indicates the N-terminal or the C-terminal
WRKY domain. Stars and black lines represent orthologous WRKY proteins of
cucumber and Arabidopsis. The tree was constructed with PHYLIP 3.2 and
displayed with njplot software.

Additional file 2: Putative orthologs of cucumber and Arabidopsis.
Identified WRKY proteins in cucumber and their putative orthologs in
Arabidopsis based on phylogenetic studies of WRKY domain sequences.

Additional file 3: Amino acid motif analysis of CsWRKY proteins
from different groups (or subgroups) and selected group 3 AtWRKY
proteins. Motif analysis was performed using MEME 4.0 software. The
schematic diagram was obtained with a Perl-SVG script and edited in
Photoshop 7.0.






References 

1. Eulgem T, Rushton PJ, Robatzek S, Somssich IE: The WRKY superfamily of 
plant transcription factors. Trends Plant Sci 2000, 5:199-206. 

2. Rushton PJ, Macdonald H, Huttly AK, Lazarus CM, Hooley R: Members of a
new family of DNA-binding proteins bind to a conserved cis-element in
the promoters of α-Amy2 genes. Plant Mol Biol 1995, 29:691-702.

3. Sun C, Palmqvist S, Olsson H, Boren M, Ahlandsberg S, Jansson C: A novel
WRKY transcription factor, SUSIBA2, participates in sugar signaling in
barley by binding to the sugar-responsive elements of the iso1
promoter. Plant Cell 2003, 15:2076-2092.

4. Kazuhiko Y, Takanori K, Makoto I, Masaru T, Tomoko Y, Takashi Y, Masaaki A,
Eiko S, Takayoshi M, Yasuko T, Nobuhiro H, Takaho T, Mikako S, Akiko T,
Motoaki S, Kazuo S, Shigeyuki Y: Solution Structure of an Arabidopsis
WRKY DNA Binding Domain. Plant Cell 2005, 17:944-956.

5. Ishiguro S, Nakamura K: Characterization of a cDNA encoding a novel
DNA-binding protein, SPF1, that recognizes SP8 sequences in the 5'
upstream regions of genes coding for sporamin and β-amylase from
sweet potato. Mol Gen Genet 1994, 244:563-571.

6. Rushton PJ, Torres JT, Parniske M, Wernert P, Hahlbrock K, Somssich IE: 
Interaction of elicitor-induced DNA-binding proteins with elicitor 
response elements in the promoters of parsley PR1 genes. EMBO J 1996, 
15:5690-5700. 

7. Kim DJ, Smith SM, Leaver CJ: A cDNA encoding a putative SPF1-type 
DNA-binding protein from cucumber. Gene 1997, 185:265-269. 

8. Dellagi A, Heilbronn J, Avrova A, Montesano M, Palva ET, Stewart HE, 
Toth IK, Cooke D, Lyon G, Birch P: A potato gene encoding a WRKY-like 
transcription factor is induced in interactions with Erwinia carotovora 
subsp atroseptica and Phytophthora infestans and is coregulated with 
class I endochitinase expression. Mol Plant-Microbe Interact 2000, 
13:1092-1101. 

9. Huang T, Duman JG: Cloning and characterization of a thermal hysteresis
(antifreeze) protein with DNA-binding activity from winter bittersweet
nightshade, Solanum dulcamara. Plant Mol Biol 2002, 48:339-350.

10. Pnueli L, Hallak HE, Rozenberg M, Cohen M, Goloubinoff P, Kaplan A, 
Mittler R: Molecular and biochemical mechanisms associated with 
dormancy and drought tolerance in the desert legume Retama raetam. 
Plant J 2002, 31:319-330. 

11. Ulker B, Somssich IE: WRKY transcription factors: from DNA binding 
towards biological function. Curr Opin Plant Biol 2004, 7:491-498. 

12. Mantri NL, Ford R, Coram TE, Pang EC: Transcriptional profiling of
chickpea genes differentially regulated in response to high-salinity, cold
and drought. BMC Genomics 2007, 8:303.

13. Kato N, Dubouzet E, Kokabu Y, Yoshida S, Taniguchi Y, Dubouzet JG, 
Yazaki K, Sato F: Identification of a WRKY protein as a transcriptional 
regulator of benzylisoquinoline alkaloid biosynthesis in Coptis japonica. 
Plant Cell Physiol 2007, 48:8-18. 

14. Marchive C, Mzid R, Deluc L, Barrieu F, Pirrello J, Gauthier A, Corio-Costet A,
Regad F, Cailleteau B, Hamdi S, Lauvergeat V: Isolation and characterization
of a Vitis vinifera transcription factor, VvWRKY1, and its effect on
responses to fungal pathogens in transgenic tobacco plants. J Exp Bot
2007, 58:1999-2010.

15. Zhou QY, Tian AG, Zou HF, Xie ZM, Lei G, Huang J, Wang CM, Wang HW, 
Zhang JS, Chen SY: Soybean WRKY-type transcription factor genes, 
GmWRKY13, GmWRKY21, and GmWRKY54, confer differential tolerance 
to abiotic stress in transgenic Arabidopsis plants. Plant Biotechnol J 2008, 
6:486-503. 

16. Liu JJ, Ekramoddoullah AK: Identification and characterization of the 
WRKY transcription factor family in Pinus monticola. Genome 2009, 
52:77-88. 

17. Wu KL, Guo ZJ, Wang HH, Li J: The WRKY family of transcription factors in 
rice and Arabidopsis and their origins. DNA Research 2005, 12:9-26. 

18. Ramamoorthy R, Jiang SY, Kumar N, Venkatesh PN, Ramachandran S: A
comprehensive transcriptional profiling of the WRKY gene family in rice
under various abiotic and phytohormone treatments. Plant Cell Physiol
2008, 49:865-879.

19. Dong J, Chen C, Chen Z: Expression profiles of the Arabidopsis WRKY
gene superfamily during plant defense response. Plant Mol Biol 2003,
51:21-37.

20. Xu X, Chen C, Fan B, Chen Z: Physical and functional interactions 
between pathogen-induced Arabidopsis WRKY18, WRKY40, and WRKY60 
transcription factors. Plant Cell 2006, 18:1310-1326. 



21. Li J, Brader G, Kariola T, Palva T: WRKY70 modulates the selection of 
signaling pathways in plant defense. Plant J 2006, 46:477-491. 

22. Oh SK, Yi SY, Yu SH, Moon JS, Park JM, Choi D: CaWRKY2, a chili pepper 
transcription factor, is rapidly induced by incompatible plant pathogens. 
Mol Cells 2006, 22:58-64. 

23. Zheng Z, Mosher SL, Fan B, Klessig DF, Chen Z: Functional analysis of 
Arabidopsis WRKY25 transcription factor in plant defense against 
Pseudomonas syringae. BMC Plant Biol 2007, 7:2. 

24. Zheng Z, Qamar SA, Chen Z, Mengiste T: Arabidopsis WRKY33 
transcription factor is required for resistance to necrotrophic fungal 
pathogens. Plant J 2006, 48:592-605. 

25. Beyer K, Binder A, Boller T, Colling M: Identification of potato genes
induced during colonization by Phytophthora infestans. Mol Plant Pathol
2001, 2:125-134.

26. Kalde M, Barth M, Somssich IE, Lippok B: Members of the Arabidopsis 
WRKY group III transcription factors are part of different plant defense 
signaling pathways. Mol Plant-Microbe Interact 2003, 16:295-305. 

27. Knoth C, Ringler J, Dangl JL, Eulgem T: Arabidopsis WRKY70 is required for 
full RPP4-mediated disease resistance and basal defense against 
Hyaloperonospora parasitica. Mol Plant-Microbe Interact 2007, 20:120-128. 

28. Johnson SC, Kolevski B, Smyth DR: Transparent testa glabra2, a trichome 
and seed coat development gene of Arabidopsis, encodes a WRKY 
transcription factor. Plant Cell 2002, 14:1359-1375. 

29. Lagace M, Matton DP: Characterization of a WRKY transcription factor 
expressed in late torpedo-stage embryos of Solanum chacoense. Planta 
2004, 219:185-189. 

30. Robatzek S, Somssich IE: Targets of AtWRKY6 regulation during plant
senescence and pathogen defense. Genes Dev 2002, 16:1139-1149.

31. Zhang ZL, Xie Z, Zou X, Casaretto J, David TH, Zhen QJ: A rice WRKY gene 
encodes a transcriptional repressor of the gibberellin signaling pathway 
in aleurone cells. Plant Physiol 2004, 134:1500-1513. 

32. Zou X, Seemann JR, Neuman D, Shen QJ: A WRKY gene from creosote 
bush encodes an activator of the abscisic acid signaling pathway. J Biol 
Chem 2004, 279:55770-55779. 

33. Xie Z, Zhang ZL, Zou X, Yang G, Komatsu S, Shen QJ: Interactions of two 
abscisic-acid induced WRKY genes in repressing gibberellin signaling in 
aleurone cells. Plant J 2006, 46:231-242. 

34. Du L, Chen Z: Identification of genes encoding receptor-like protein
kinases as possible targets of pathogen- and salicylic acid-induced
WRKY DNA-binding proteins in Arabidopsis. Plant J 2002, 24:837-847.

35. Karam BS, Rhonda CF, Luis OS: Transcription factors in plant defense and 
stress response. Curr Opin Plant Biol 2002, 5:430-436. 

36. Motoaki S, Mari NJ, Ishida TN, Miki F, Youko O, Asako K, Maiko N, Akiko E, 
Tetsuya S, Masakazu S, Kenji A, Teruaki T, Kazuko YS, Piero C, Jun K, 
Yoshihide H, Kazuo S: Monitoring the expression profiles of 7000 
Arabidopsis genes under drought, cold and high-salinity stresses using a 
full-length cDNA microarray. Plant J 2002, 31:279-292. 

37. Kilian J, Whitehead D, Horak J, Wanke D, Weinl S, Batistic O, D'Angelo C, 
Bornberg-Bauer E, Kudla J, Harter K: The AtGenExpress global stress 
expression data set: protocols, evaluation and model data analysis of 
UV-B light, drought and cold stress responses. Plant J 2007, 50:347-363. 

38. Mare C, Mazzucotelli E, Crosatti C, Francia E, Stanca AM, Cattivelli L: Hv- 
WRKY38: a new transcription factor involved in cold- and drought- 
response in barley. Plant Mol Biol 2004, 55:399-416. 

39. Liu SQ, Xu L, Jia ZQ, Xu Y, Yang Q, Fei ZJ, Lu XY, Chen HM, Huang SW:
Genetic association of ETHYLENE-INSENSITIVE3-like sequence with the
sex-determining M locus in cucumber (Cucumis sativus L.). Theor Appl
Genet 2004, 117:927-933.

40. Huang SW, Li RQ, Zhang ZH, Li L, Gu XF, Fan W, Lucas WJ, Wang XW,
Xie BY, Ni PX, Ren YY, Zhu HM, Li J, Lin K, Jin WW, Fei ZJ, Li GC, Staub J,
Kilian A, Vossen EAGV, Wu Y, Guo J, He J, Jia ZQ, Ren Y, Tian G, Lu Y,
Ruan J, Qian WB, Wang MW, et al: The genome of the cucumber, Cucumis
sativus L. Nature Genetics 2009, 475:1-7.

41. Ross CA, Liu Y, Shen QJ: The WRKY gene family in rice (Oryza sativa).
J Integr Plant Biol 2007, 49:827-842.

42. Rossberg M, Theres K, Acarkan A, Herrero R, Schmitt T, Schumacher K, 
Schmitz G, Schmidt R: Comparative sequence analysis reveals extensive 
microcolinearity in the lateral suppressor regions of the tomato, 
Arabidopsis, and Capsella genomes. Plant Cell 2001, 13:979-988. 

43. Schauser L, Wieloch W, Stougaard J: Evolution of NIN-like proteins in 
Arabidopsis, rice, and Lotus japonicus. J Mol Evol 2005, 60:229-237. 






44. Zhang YJ, Wang LJ: The WRKY transcription factor superfamily: its origin
in eukaryotes and expansion in plants. BMC Evolutionary Biology 2005, 5:1.

45. Blanc G, Hokamp K, Wolfe KH: A recent polyploidy superimposed on
older large-scale duplications in the Arabidopsis genome. Genome Res
2003, 13:137-144.

46. Cannon SB, Mitra A, Baumgarten A, Young ND, May G: The roles of
segmental and tandem gene duplication in the evolution of large gene
families in Arabidopsis thaliana. BMC Plant Biol 2004, 4:10.

47. Taylor JS, Raes J: Duplication and divergence: The evolution of new
genes and old ideas. Annu Rev Genet 2004, 38:615-643.

48. Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N,
Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C,
Horner D, Mica E, Jublot D, Poulain J, Bruyere C, Billault A, Segurens B,
Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C,
Alaux M, Di Gaspero G, Dumas V, et al: The grapevine genome sequence
suggests ancestral hexaploidization in major angiosperm phyla. Nature
2007, 449:463-467.

49. Berri S, Abbruscato P, Faivre-Rampant O, Brasileiro AC, Fumasoni I,
Satoh K, Kikuchi S, Mizzi L, Morandini P, Pe ME, Piffanelli P: Characterization
of WRKY co-regulatory networks in rice and Arabidopsis. BMC Plant Biol
2009, 9:120.

50. Li L, Stoeckert CJ Jr, Roos DS: OrthoMCL: identification of ortholog groups
for eukaryotic genomes. Genome Res 2003, 13:2178-2189.

51. Chen F, Mackey AJ, Vermunt JK, Roos DS: Assessing performance of
orthology detection strategies applied to eukaryotic genomes. PLoS ONE
2007, 2:e383.

52. Mangelsen E, Kilian J, Berendzen KW, Kolukisaoglu H, Harter K, Jansson C,
Wanke D: Phylogenetic and comparative gene expression analysis of
barley (Hordeum vulgare) WRKY transcription factor family reveals
putatively retained functions between monocots and dicots. BMC
Genomics 2008, 9:194.

53. Blanc G, Wolfe KH: Functional divergence of duplicated genes formed by
polyploidy during Arabidopsis evolution. Plant Cell 2004, 16:1679-1691.

54. Zhang J: Evolution by gene duplication: an update. Trends Ecol Evol 2003,
18:292-298.

55. The Arabidopsis Information Resource (TAIR). [http://www.arabidopsis.org/].

56. Rice Genome Annotation Project. [http://rice.plantbiology.msu.edu/index.shtml].

57. The Pfam database of protein domains and HMMs. [http://pfam.jouy.inra.fr/].

58. Cucumber Genome DataBase. [http://cucumber.genomics.org.cn/page/cucumber/index.jsp].

59. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The
CLUSTAL_X windows interface: flexible strategies for multiple sequence
alignment aided by quality analysis tools. Nucleic Acids Res 1997,
25:4876-4882.

60. Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary
Genetics Analysis (MEGA) software version 4.0. Molecular Biology and
Evolution 2007, 24:1596-1599.

61. Felsenstein J: PHYLIP - Phylogeny Inference Package (Version 3.2).
Cladistics 1989, 5:164-166.

62. Zhang GY, Chen M, Chen XP, Xu ZS, Guan S, Li LC, Li AL, Guo JM, Mao L,
Ma YZ: Phylogeny, gene structures, and expression patterns of the ERF
gene family in soybean (Glycine max L.). J Exp Bot 2008, 59:4095-4107.

63. Bailey TL, Williams N, Misleh C, Li WW: MEME: discovering and analyzing
DNA and protein sequence motifs. Nucleic Acids Res 2006, 34:W369-W373.

64. Jorge I, Ribichich FKarina, Dezar ACarlos, Chan LRaquel: Expression
analyses indicate the involvement of sunflower WRKY transcription
factors in stress responses, and phylogenetic reconstructions reveal the
existence of a novel clade in the Asteraceae. Plant Science 2010,
178:398-410.

65. Weigel World Database. [http://www.weigelworld.org/resources/microarray/AtGenExpress/].

66. Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol
Evol 2007, 24:1586-1591.

67. Yang Z, Gu S, Wang X, Li W, Tang Z, Xu C: Molecular evolution of the
CPP-like gene family in plants: Insights from comparative genomics of
Arabidopsis and rice. J Mol Evol 2008, 67:266-277.

68. Jiang SY, Bachmann D, La H, Ma Z, Venkatesh PN, Ramamoorthy R,
Ramachandran S: Ds insertion mutagenesis as an efficient tool to
produce diverse variations for rice breeding. Plant Mol Biol 2007,
65:385-402.



doi:10.1186/1471-2164-12-471

Cite this article as: Ling et al.: Genome-wide analysis of WRKY gene
family in Cucumis sativus. BMC Genomics 2011 12:471.






